Abstract: Language models trained on monolingual corpora improve the performance of statistical machine translation, but such monolingual data cannot be exploited as effectively by neural machine translation. To solve this problem, a semi-supervised neural machine translation model based on sentence-level bilingual evaluation understudy (BLEU) data selection is proposed. Candidate translations for the unlabeled data are first generated by statistical machine translation and neural machine translation models, respectively. The candidate translations are then selected according to their sentence-level BLEU scores, and the selected candidates are added to the labeled dataset for semi-supervised joint training. Experimental results demonstrate that the proposed algorithm makes effective use of unlabeled data. On the NIST Chinese-English translation tasks, the proposed method obtains an obvious improvement over the baseline system trained only on the labeled data.
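The selection step can be illustrated with a minimal sketch. Here sentence-level BLEU is computed between the NMT hypothesis and the SMT hypothesis used as a pseudo-reference, and pairs above a threshold are kept as pseudo-parallel training data; the function name, the exact selection criterion, and the threshold are assumptions for illustration, not the paper's specification.

```python
# Hypothetical sketch of sentence-level BLEU data selection for
# semi-supervised NMT (criterion and threshold are illustrative).
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

smooth = SmoothingFunction().method1  # smoothing is needed for short sentences


def select_pseudo_parallel(sources, smt_outputs, nmt_outputs, threshold=0.3):
    """Keep unlabeled sentences whose SMT and NMT translations agree.

    Agreement is measured by sentence-level BLEU of the NMT hypothesis
    against the SMT hypothesis used as a pseudo-reference (an assumption).
    """
    selected = []
    for src, smt_hyp, nmt_hyp in zip(sources, smt_outputs, nmt_outputs):
        score = sentence_bleu([smt_hyp.split()], nmt_hyp.split(),
                              smoothing_function=smooth)
        if score >= threshold:
            # trust the NMT translation and pair it with the source sentence
            selected.append((src, nmt_hyp))
    return selected


# Usage: augment the labeled bitext with the selected pseudo-parallel pairs,
# then retrain the NMT model on the combined data (joint training).
```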